Abstract

The coherence of a random matrix, which is defined to be the largest magnitude of 
the Pearson correlation coefficients between the columns of the random matrix, is an 
important quantity for a wide range of applications including high-dimensional statistics 
and signal processing. Inspired by these applications, this paper studies the limiting 
laws of the coherence of $n \times p$ random matrices for a full range of the dimension $p$ with
a special focus on the ultra high-dimensional setting. Assuming the columns of the
random matrix are independent random vectors with a common spherical distribution,
we give a complete characterization of the behavior of the limiting distributions of the
coherence. More specifically, the limiting distributions of the coherence are derived
separately for three regimes: $\frac{1}{n}\log p \to 0$, $\frac{1}{n}\log p \to \beta \in (0, \infty)$, and $\frac{1}{n}\log p \to \infty$.
The results show that the limiting behavior of the coherence differs significantly in 
different regimes and exhibits interesting phase transition phenomena as the dimension 
p grows as a function of n. Applications to statistics and compressed sensing in the 
ultra high-dimensional setting are also discussed.

1 Introduction



With dramatic advances in computing and technology, large and high-dimensional datasets 
are now routinely collected in many scientific investigations. The associated statistical in- 
ference problems, where the dimension p can be much larger than the sample size n, arise 
naturally in a wide range of applications including compressed sensing, climate studies, 
genomics, functional magnetic resonance imaging, risk management and portfolio alloca- 
tion. Conventional statistical methods and results based on fixed p and large n are no 
longer applicable and these applications call for new technical tools and new statistical 
procedures. 

The coherence of a random matrix, which is defined to be the largest magnitude of the 
off-diagonal entries of the sample correlation matrix generated from the random matrix, has 
been shown to be an important quantity for many applications. For example, the coherence 
has been used for testing the covariance structure of high-dimensional distributions (Cai 
and Jiang (2010)), the construction of compressed sensing matrices and high dimensional 
regression in statistics (see, e.g., Candes and Tao (2005), Donoho, Elad and Temlyakov 
(2006) and Cai, Wang and Xu (2010a, b)). In addition, the coherence has also been used in 
signal processing, medical imaging, and seismology. Some of these problems are seemingly 
unrelated at first sight, but interestingly they can all be attacked through the use of the 
limiting laws of the coherence of random matrices (see, e.g., Cai and Jiang (2010)). In 
these applications, a case of special interest is when the dimension p is much larger than 
the sample size n. Indeed, in compressed sensing and other related problems the goal is 
often to make the dimension p as large as possible relative to the sample size n. 

In the present paper we study the limiting laws of the coherence of random matrices. 
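Let $x = (x_1, \dots, x_n)^T \in \mathbb{R}^n$ and $y = (y_1, \dots, y_n)^T \in \mathbb{R}^n$. Recall the Pearson correlation
coefficient $\rho$ defined by

$$\rho = \rho_{x,y} = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^n (x_i - \bar{x})^2 \cdot \sum_{i=1}^n (y_i - \bar{y})^2}} \tag{1}$$

where $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^n y_i$. Let $X_1, \dots, X_p$ be independent $n$-dimensional
random vectors, and let $\rho_{ij}$ be the correlation coefficient between $X_i$ and $X_j$. Set $X =
(X_1, \dots, X_p) = (x_{ij})_{n \times p}$. The coherence of the random matrix $X$ is defined as

$$L_n = \max_{1 \le i < j \le p} |\rho_{ij}|. \tag{2}$$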

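In certain applications such as the construction of compressed sensing matrices, the means
$\mu_i = EX_i$ and $\mu_j = EX_j$ are given and one is interested in

$$\tilde{\rho}_{ij} = \frac{(X_i - \mu_i)^T (X_j - \mu_j)}{\|X_i - \mu_i\| \cdot \|X_j - \mu_j\|}, \quad 1 \le i, j \le p, \tag{3}$$

and the corresponding coherence is defined by

$$\tilde{L}_n = \max_{1 \le i < j \le p} |\tilde{\rho}_{ij}|. \tag{4}$$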

The goal of this paper is to give a complete characterization of the behavior of the limiting
distributions of $L_n$ and $\tilde{L}_n$ over the full range of $p$ (as a function of $n$) including the
super-exponential case where $(\log p)/n \to \infty$.
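The two definitions (2) and (4) can be made concrete with a small numerical sketch. The function name `coherence` and its `centered` flag are illustrative choices, not notation from the paper; the zero-mean branch assumes the known means $\mu_i$ are all 0.

```python
import numpy as np

def coherence(X, centered=True):
    """Largest absolute correlation between distinct columns of an n x p array.

    centered=True  subtracts each column's sample mean, as in (1)-(2).
    centered=False uses the columns as given, i.e. (3)-(4) with known means 0.
    """
    X = np.asarray(X, dtype=float)
    if centered:
        X = X - X.mean(axis=0)            # subtract each column's sample mean
    U = X / np.linalg.norm(X, axis=0)     # rescale every column to unit length
    G = U.T @ U                           # G[i, j] = correlation of columns i and j
    np.fill_diagonal(G, 0.0)              # only off-diagonal entries count
    return float(np.abs(G).max())

# Example: a 3 x 2 matrix whose two columns are orthogonal.
X_demo = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
L_centered = coherence(X_demo)                    # sample-mean version (2)
L_known_mean = coherence(X_demo, centered=False)  # zero-mean version (4)
```

For this example the zero-mean version is 0 because the columns are orthogonal, while centering changes the answer to 0.5, which illustrates that the two coherences are different statistics.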

The coherence $L_n$ has been studied intensively in recent years. Jiang (2004) was the first
to show that if the $x_{ij}$ are independent and identically distributed (i.i.d.) with $E|x_{ij}|^{30+\epsilon} < \infty$
for some $\epsilon > 0$ and $n/p \to \gamma \in (0, \infty)$, then $nL_n^2 - 4\log p + \log\log p$ converges weakly to an
extreme distribution of type I with distribution function

$$F(y) = e^{-\frac{1}{\sqrt{8\pi}} e^{-y/2}}, \quad y \in \mathbb{R}. \tag{5}$$

Throughout this paper, $\log x = \log_e x$ for any $x > 0$ and $p = p_n$ depends on $n$ only. The
result (5) was later improved in several papers by sharpening the moment assumptions and
relaxing the restrictions between $n$ and $p$. In terms of the relationship between $n$ and $p$,
these results can be classified into the following categories:

(a). Linear rate: $p \sim cn$ with $c$ being a constant. Li and Rosalsky (2006), Zhou (2007),
Li, Liu and Rosalsky (2009) and Li, Qi and Rosalsky (2010) improved the moment
conditions to make (5) valid under the condition $p/n \to c \in (0, 1)$.

(b). Polynomial rate: $p = O(n^\alpha)$ with $\alpha > 0$ being a constant. Liu, Lin and Shao (2008)
showed that (5) holds as $p \to \infty$ and $p = O(n^\alpha)$ where $\alpha$ is a constant. That is, (5)
still holds when $n$ and $p$ are in the polynomial rates.

(c). Sub-exponential rate: $\log p = o(n^\alpha)$ with $0 < \alpha \le 1/3$ being a constant. Motivated
by applications in testing high-dimensional covariance structure and construction of
compressed sensing matrices, Cai and Jiang (2010) further extended the range of $p$ by
considering the sub-exponential rate. It was shown that (5) is also valid if $\log p = o(n^\alpha)$
with $\alpha \in (0, 1/3]$ and the distribution of $x_{11}$ is well-behaved. In particular, (5) holds
with $\alpha = 1/3$ when the $x_{ij}$'s are i.i.d. $N(0, 1)$ random variables.

An interesting question is whether the limiting distribution (5) holds for even higher
dimensional cases when $\log p$ is of order $n^\alpha$ with $\alpha > 1/3$. This is a case of significant interest
in high-dimensional data analysis and signal processing. For example, in the context of
high-dimensional regression and classification, simulation studies about the distribution of $L_n$
were made in Cai and Lv (2007) and Fan and Lv (2008 and 2010). In this paper we shall
study the limiting laws of the coherence $L_n$ for a full range of the values of $p$. To make our
technical analysis tractable, we focus on the setting where the columns $X_j$ of the random
matrix $X$ follow a spherical distribution, which contains the normal distribution $N(0, \sigma^2 I_n)$
as a special case. Motivated by the applications in statistics and signal processing mentioned
earlier, we are especially interested in the ultra high dimensional case. More specifically,
we consider three different regimes:

(i). the sub-exponential case: $\frac{1}{n}\log p \to 0$;

(ii). the exponential case: $\frac{1}{n}\log p \to \beta \in (0, \infty)$;

(iii). the super-exponential case: $\frac{1}{n}\log p \to \infty$.

Our results show that the limiting behavior of $L_n$ differs significantly in different regimes
and exhibits interesting phase transition phenomena as the dimension $p$ grows as a function
of $n$. To answer the question posed earlier, it is shown that $nL_n^2 - 4\log p + \log\log p$ converges
to the limiting distribution given in (5) if and only if $\log p = o(n^{1/2})$. The phase transition
in the limiting distribution first occurs when $\log p$ is of order $n^{1/2}$. In this
transitional case, an additional shift in the limiting distribution occurs. When the dimension
$p$ further grows as a function of $n$, another transition occurs in the range where $\log p$ is of
the same order as $n$. In the sub-exponential case, $L_n$ converges to 0 in probability. When
$\log p \sim \beta n$ for some positive constant $\beta$, $L_n$ converges in probability to a constant strictly
between 0 and 1, and the limiting distribution of $T_n = \log(1 - L_n^2)$ is significantly different
from that in the sub-exponential case. If $p$ is further increased to the super-exponential
case, $L_n$ converges to 1 in probability and the limiting distribution of $T_n$ becomes the
extreme value distribution without a shift.

There are also interesting differences between the limiting behaviors of $L_n$ and $\tilde{L}_n$. As
shown in Cai and Jiang (2010), the limiting laws of $L_n$ and $\tilde{L}_n$ coincide with each other
when the $x_{ij}$ are iid $N(0, 1)$ variables and $\log p = o(n^{1/3})$. Our results show that this remains
true in the current setting for the sub-exponential and exponential cases, but not true
for the super-exponential case. It is interesting to contrast the results obtained in this
paper with the results on $L_n$ and $\tilde{L}_n$ in the previous literature. The only known limiting
distribution of $L_n$ and $\tilde{L}_n$ is given in (5) and the best known result in terms of the range of
$p$ is $\log p = o(n^{1/3})$. In comparison, our study significantly extends the knowledge on the
limiting distributions of the coherence and shows the "colorful" phase transition phenomena
as the dimension $p$ increases.

The limiting laws of the coherence have immediate applications in statistics and signal 
processing. Testing the covariance structure of a high dimensional random variable is an 
important problem in statistical inference. A particularly interesting problem is to test for 
independence in the Gaussian case because many statistical procedures are built upon the 
assumptions of independence and normality of the observations. The limiting laws of the 
coherence derived in this paper can be used directly to construct a test for independence in 
the ultra high dimensional setting. In addition, the limiting laws can also be used for the 
construction of compressed sensing matrices. We shall discuss these applications in Section 3.

Many sophisticated probabilistic tools have been used in the previous literature to study
the limiting laws of the coherence. For example, the Chen-Stein method, large deviation 
inequalities, and strong approximations were used to derive the results mentioned earlier 
in (a), (b) and (c). Yet there appear to be limitations to these methods. It is unclear (to
us) whether these techniques can be easily adapted to derive the limiting distribution of
$L_n$ when $\log p$ is of order $n^\alpha$ for $\alpha > 1/3$ and answer the question posed earlier. See Remark
4.1 in Section 4 for further discussions. In this paper a different technique is developed.
Under the assumption that $X_j$ in (2) has a spherical distribution, we first show a somewhat
surprising result that the sample correlation coefficients $\{\rho_{ij};\ 1 \le i < j \le p\}$ are pairwise
independent. We then apply the Chen-Stein method to the coherence $L_n = \max_{1 \le i < j \le p} |\rho_{ij}|$
by using the exact distribution of $\rho_{ij}$ and the pairwise-independence structure of $\rho_{ij}$. In
addition, the exact distribution of $\rho_{ij}$ also leads to some interesting properties of $\rho_{ij}$ in
the small sample cases: $\rho_{ij}$ has the symmetric Bernoulli distribution for $n = 2$, that is,
$P(\rho_{ij} = \pm 1) = 1/2$; $\rho_{ij}^2$ follows the arcsine law on $[0, 1]$ for $n = 3$; $\rho_{ij}$ follows the uniform
distribution on $[-1, 1]$ for $n = 4$; and $\rho_{ij}$ follows the semi-circle law for $n = 5$.

The rest of the paper is organized as follows. Section 2 studies the limiting laws of
the coherence $L_n$ and $\tilde{L}_n$ of a random matrix in the high-dimensional setting under the
three regimes. The interesting phase transition phenomena are discussed in detail. Section
3 considers two direct applications of the limiting laws derived in this paper to statistics
and signal processing in the ultra high dimensional setting. Section 4 discusses some of the
interesting aspects of the techniques used in the derivations. Connections and differences
with other related work, for example, the relationship between the sample correlation
coefficients and the angles between random vectors, are discussed in Section 5. The main
results are proved in Section 6.

2 Limiting Laws of the Coherence 

In this section we study separately the limiting behaviors of the coherence $L_n$ and $\tilde{L}_n$ of an
$n \times p$ random matrix $X$ under the three regimes: $\frac{1}{n}\log p \to 0$, $\frac{1}{n}\log p \to \beta \in (0, \infty)$, and
$\frac{1}{n}\log p \to \infty$. As mentioned before, we shall focus on the setting where the columns $X_j$ of
the random matrix $X$ follow a spherical distribution.

2.1 Limiting Laws of the Coherence $L_n$

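A random vector $Y \in \mathbb{R}^n$ is said to follow a spherical distribution if $OY$ and $Y$ have the
same probability distribution for all $n \times n$ orthogonal matrices $O$. Examples of spherical
distributions include:

• the multivariate normal distribution $N(0, \sigma^2 I_n)$ with $\sigma > 0$;

• the normal scale-mixture distribution $\sum_{k=1}^K \epsilon_k N(0, \sigma_k^2 I_n)$ with the density function

$$f(x) = \frac{1}{(2\pi)^{n/2}} \sum_{k=1}^K \frac{\epsilon_k}{\sigma_k^n} \exp\Big(-\frac{x^T x}{2\sigma_k^2}\Big), \tag{6}$$

where $\sigma_k > 0$, $\epsilon_k > 0$, and $\sum_{k=1}^K \epsilon_k = 1$;

• the multivariate $t$ distribution with $m$ degrees of freedom and density function

$$f(x) = \frac{\Gamma\big(\frac{n+m}{2}\big)}{\Gamma\big(\frac{m}{2}\big)(m\pi)^{n/2}} \Big(1 + \frac{1}{m} x^T x\Big)^{-(n+m)/2} \tag{7}$$

for $m \ge 1$. The case $m = 1$ corresponds to the multivariate Cauchy distribution.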

See Muirhead (1982) for further discussions on spherical distributions. 

Let $X = (X_1, \dots, X_p) = (x_{ij})_{n \times p}$ be an $n \times p$ random matrix. Throughout the rest of
this paper, we shall assume:

Assumption (A): the columns $X_1, \dots, X_p$ are independent $n$-dimensional random vectors
with a common spherical distribution (which may depend on $n$) and $P(X_1 = 0) = 0$.
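A minimal sketch of how columns satisfying Assumption (A) might be simulated. The names `sample_spherical_columns` and `max_abs_corr` are illustrative, and the multivariate $t$ column is generated with the standard representation "Gaussian vector divided by an independent $\sqrt{\chi^2_m/m}$", which is an assumption of this sketch rather than something stated in the text. The second function illustrates a simple fact used implicitly throughout: correlations are unchanged when a column is multiplied by a positive constant, so only the directions of the columns matter.

```python
import numpy as np

def sample_spherical_columns(n, p, kind="normal", m=3, rng=None):
    """Draw an n x p array whose columns are i.i.d. spherical vectors.

    kind="normal": columns are N(0, I_n).
    kind="t":      columns are multivariate t with m degrees of freedom,
                   built as Gaussian / sqrt(chi2_m / m) (one factor per column).
    """
    rng = np.random.default_rng() if rng is None else rng
    Z = rng.standard_normal((n, p))
    if kind == "normal":
        return Z
    if kind == "t":
        scale = np.sqrt(rng.chisquare(m, size=p) / m)
        return Z / scale                  # broadcasting divides column j by scale[j]
    raise ValueError("kind must be 'normal' or 't'")

def max_abs_corr(X):
    """Largest absolute off-diagonal sample correlation between columns."""
    R = np.corrcoef(X, rowvar=False)
    np.fill_diagonal(R, 0.0)
    return float(np.abs(R).max())

# Rescaling columns by positive factors leaves the coherence unchanged.
rng_demo = np.random.default_rng(0)
Z_demo = rng_demo.standard_normal((20, 30))
factors = rng_demo.uniform(0.1, 10.0, size=30)
same = np.isclose(max_abs_corr(Z_demo), max_abs_corr(Z_demo * factors))
```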

The condition $P(X_1 = 0) = 0$ is to ensure that the correlation coefficients are well
defined. Let $\rho_{ij}$ be the Pearson correlation coefficient of $X_i$ and $X_j$ for $1 \le i < j \le p$.
Then, $\Psi_n := (\rho_{ij})_{p \times p}$ is the sample correlation matrix of $X$, and $L_n$, defined in (2), is the largest
magnitude of the off-diagonal entries of the sample correlation matrix $\Psi_n$.

To make the statements of the limiting distributions uniform across different regimes,
we shall state all the results in the main theorems in terms of $T_n = \log(1 - L_n^2)$. We begin
with the sub-exponential case.

Theorem 1 (Sub-Exponential Case) Suppose $p = p_n$ satisfies $(\log p)/n \to 0$ as $n \to
\infty$. Then under Assumption (A),

(i). $L_n \to 0$ in probability as $n \to \infty$.

(ii). Let $T_n = \log(1 - L_n^2)$. Then, as $n \to \infty$,

$$nT_n + 4\log p - \log\log p \tag{8}$$

converges weakly to an extreme distribution with the distribution function $F(y) =
1 - e^{-K e^{y/2}}$, $y \in \mathbb{R}$, and $K = 1/\sqrt{8\pi}$.

The following law of large numbers is a direct consequence of Theorem 1.

Corollary 2.1 Assume the same conditions as in Theorem 1. Then

$$\sqrt{\frac{n}{\log p}}\, L_n \to 2 \tag{9}$$

in probability as $n \to \infty$.
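A quick Monte Carlo sketch of the normalization in Corollary 2.1 for Gaussian columns. The sizes `n=200`, `p=400` and the seed are arbitrary illustrative choices; because the centering in (8) contains a $\log\log p$ term, the ratio approaches 2 slowly and is typically somewhat below 2 at sizes like these.

```python
import numpy as np

def scaled_coherence(n, p, seed=0):
    """Return sqrt(n / log p) * L_n for one simulated n x p Gaussian matrix."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    R = np.corrcoef(X, rowvar=False)   # p x p sample correlation matrix
    np.fill_diagonal(R, 0.0)           # drop the unit diagonal
    L_n = np.abs(R).max()              # coherence as in (2)
    return float(np.sqrt(n / np.log(p)) * L_n)

ratio = scaled_coherence(n=200, p=400)
```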
This result actually provides the convergence speed of $L_n \to 0$ stated in Theorem 1(i). It is
stronger than Theorem 2 in Cai and Jiang (2010), which shows that (9) holds if $\log p = o(n^{1/3})$
and the $x_{ij}$'s are i.i.d. $N(0, 1)$ random variables.

Theorem 1 also shows an interesting phase transition phenomenon of the limiting
behavior of the coherence $L_n$.

Corollary 2.2 (Transitional Case) Suppose $p = p_n$ satisfies $\lim_{n \to \infty} (\log p)/\sqrt{n} =
\alpha \in [0, \infty)$. Then under Assumption (A),

$$nL_n^2 - 4\log p + \log\log p \tag{10}$$

converges weakly to the distribution function $\exp\{-\frac{1}{\sqrt{8\pi}} e^{-(y + 8\alpha^2)/2}\}$, $y \in \mathbb{R}$.

As mentioned in the introduction, Cai and Jiang (2010) show that $nL_n^2 - 4\log p + \log\log p$
converges weakly to an extreme distribution with distribution function given in (5) when
$\log p = o(n^{1/3})$ and the $x_{ij}$ are independent standard normal variables. This is the best known
result in the literature in terms of the range of $p$. Corollary 2.2 shows that (5) holds if
and only if $\log p = o(n^{1/2})$ when $X_1$ has a spherical distribution, which includes the normal
distribution $N(0, I_n)$ as a special case. This answers the question asked earlier in this
paper. Corollary 2.2 also shows that the limiting distribution of $L_n$ has a transitional phase
between $(\log p)/\sqrt{n} \to 0$ and $(\log p)/\sqrt{n} \to \infty$. In the transitional case when $(\log p)/\sqrt{n} \to
\alpha \in (0, \infty)$, the limiting distribution of $nL_n^2 - 4\log p + \log\log p$ is shifted to the left by $8\alpha^2$.
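The origin of the shift $8\alpha^2$ can be seen from a heuristic expansion. This is only a sketch, not the proof: it uses $L_n^2 \approx 4(\log p)/n$ from Corollary 2.1 and the Taylor series of $\log(1 - x)$, under the assumption $(\log p)/\sqrt{n} \to \alpha$.

```latex
\begin{align*}
nT_n = n\log(1 - L_n^2) &= -nL_n^2 - \tfrac{n}{2}L_n^4 - O(nL_n^6), \\
\tfrac{n}{2}L_n^4 &\approx \tfrac{n}{2}\Big(\tfrac{4\log p}{n}\Big)^2
   = \tfrac{8(\log p)^2}{n} \to 8\alpha^2, \\
nL_n^6 &\approx \tfrac{64(\log p)^3}{n^2} \to 0, \\
nL_n^2 - 4\log p + \log\log p
   &= -\big(nT_n + 4\log p - \log\log p\big) - 8\alpha^2 + o_P(1).
\end{align*}
```

Combining the last line with the limit $F(y) = 1 - e^{-Ke^{y/2}}$ of (8) gives $P(nL_n^2 - 4\log p + \log\log p \le y) \to 1 - F(-y - 8\alpha^2) = \exp\{-K e^{-(y + 8\alpha^2)/2}\}$ with $K = 1/\sqrt{8\pi}$, which matches the distribution function in Corollary 2.2.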
We now consider the exponential case. 

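Theorem 2 (Exponential Case) Suppose $p = p_n$ satisfies $(\log p)/n \to \beta \in (0, \infty)$ as
$n \to \infty$. Then under Assumption (A),

(i). $L_n \to \sqrt{1 - e^{-4\beta}}$ in probability as $n \to \infty$.

(ii). Let $T_n = \log(1 - L_n^2)$. Then, as $n \to \infty$,

$$nT_n + 4\log p - \log\log p \tag{11}$$

converges weakly to the distribution function

$$F(y) = 1 - \exp\big[-K(\beta) e^{(y + 8\beta)/2}\big], \quad y \in \mathbb{R}, \quad \text{where } K(\beta) = \Big(\frac{\beta}{2\pi(1 - e^{-4\beta})}\Big)^{1/2}. \tag{12}$$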

Theorem 2 reveals the behavior of $L_n$ in the transitional case $(\log p)/n \to \beta$. In this case,
the coherence $L_n$ converges in probability to a constant strictly between 0 and 1. Dividing
(11) by $n$, it is easy to see that

$$T_n \to -4\beta \quad \text{in probability as } n \to \infty$$

since $\lim_{n \to \infty} (\log p)/n = \beta \in (0, \infty)$. This is also a direct consequence of Theorem 2(i).
Furthermore, it is trivially true that $1 - e^{-4\beta} \sim 4\beta$ as $\beta \to 0^+$. Thus,

$$\lim_{\beta \to 0^+} K(\beta) = \frac{1}{\sqrt{8\pi}},$$

which is exactly the value of $K$ in Theorem 1. Thus, the limiting distribution $F(y)$ in
Theorem 2 as $\beta \to 0^+$ becomes the limiting distribution $F(y)$ in Theorem 1. Heuristically,
the sub-exponential case covered in Theorem 1 corresponds to the case "$\beta = 0$" in Theorem
2. On the other hand, the exponential case of $(\log p)/n \to \beta \in (0, \infty)$ can also be viewed
as a transitional phase between the sub-exponential and super-exponential cases.
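The constants in Theorems 1 and 2 are easy to evaluate numerically, which gives a direct check of the limit $K(\beta) \to 1/\sqrt{8\pi}$. The function names below are illustrative; `math.expm1` is used only to keep $1 - e^{-4\beta}$ accurate for very small $\beta$.

```python
import math

def K_beta(beta):
    """K(beta) from (12), for beta > 0."""
    one_minus_exp = -math.expm1(-4.0 * beta)   # equals 1 - exp(-4*beta)
    return math.sqrt(beta / (2.0 * math.pi * one_minus_exp))

def coherence_limit(beta):
    """In-probability limit of L_n in Theorem 2(i): sqrt(1 - exp(-4*beta))."""
    return math.sqrt(-math.expm1(-4.0 * beta))

K_SUB = 1.0 / math.sqrt(8.0 * math.pi)         # the constant K of Theorem 1

# K(beta) approaches K_SUB as beta decreases to 0.
gaps = [abs(K_beta(b) - K_SUB) for b in (1.0, 0.1, 0.01, 0.001)]
```

The list `gaps` is decreasing, and `coherence_limit` increases from 0 towards 1 as $\beta$ grows, in line with the sub-exponential and super-exponential limits of $L_n$.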
Finally we turn to the super-exponential case where $(\log p)/n \to \infty$.

Theorem 3 (Super-Exponential Case) Suppose $p = p_n$ satisfies $(\log p)/n \to \infty$ as
$n \to \infty$. Let $T_n = \log(1 - L_n^2)$. Then under Assumption (A),

(i). $L_n \to 1$ in probability as $n \to \infty$. Further, $\frac{n}{\log p} T_n \to -4$ in probability as $n \to \infty$.

(ii). As $n \to \infty$,

$$nT_n + \frac{4n}{n-2}\log p - \log n \tag{13}$$

converges weakly to the distribution function $F(y) = 1 - e^{-K e^{y/2}}$, $y \in \mathbb{R}$, with $K =
1/\sqrt{2\pi}$.

The correction term of $nT_n$ in (13) is $\frac{4n}{n-2}\log p - \log n$, which is different from the term
"$4\log p - \log\log p$" appearing in (8) and (11). A reason is that $T_n$ converges to a finite
constant in probability in Theorems 1 and 2 whereas $T_n$ goes to $-\infty$ in probability in
Theorem 3. On the other hand, suppose $(\log p)/n \to \beta \in (0, \infty)$ and $\beta$ is large, then
$\log n = \log\log p - \log\beta + o(1)$ and

$$\frac{4n}{n-2}\log p = 4\log p + \frac{8}{n-2}\log p = 4\log p + 8\beta + o(1)$$

as $n \to \infty$. Consequently, the quantity in (13) becomes

$$(nT_n + 4\log p - \log\log p) + \text{constant} + o(1)$$

as $n \to \infty$. The part in the parentheses is the same as (8) in Theorem 1 and (11) in Theorem
2. This says that, heuristically, the results in Theorems 1, 2 and 3 are consistent.

The formulation in the above theorems is in terms of $T_n = \log(1 - L_n^2)$ for uniformity.
However, one can easily restate the results in terms of the coherence $L_n$. For instance,

$$P\big(n\log(1 - L_n^2) + 4\log p - \log\log p \le y\big) = P\big(L_n \ge \sqrt{s_n}\big),$$

where

$$s_n = 1 - \exp\Big\{\frac{1}{n}\big(-4\log p + \log\log p + y\big)\Big\}. \tag{14}$$
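The change of variables in (14) can be written as a small helper. This is a sketch: the function name is illustrative, and the clamp at 0 covers the case where the exponent is positive, in which the event holds for every value of the coherence.

```python
import math

def coherence_level(n, p, y):
    """Return c such that {n*log(1 - L^2) + 4*log p - log log p <= y} = {L >= c}."""
    exponent = (-4.0 * math.log(p) + math.log(math.log(p)) + y) / n
    s = -math.expm1(exponent)          # s = 1 - exp(exponent), as in (14)
    return math.sqrt(max(s, 0.0))      # s <= 0 means the event always holds

# Round trip: plugging the level back into the statistic recovers y.
n_demo, p_demo, y_demo = 100, 1000, 1.0
c_demo = coherence_level(n_demo, p_demo, y_demo)
y_back = (n_demo * math.log(1.0 - c_demo ** 2)
          + 4.0 * math.log(p_demo) - math.log(math.log(p_demo)))
```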
2.2 Limiting Laws of $\tilde{L}_n$

We now study the limiting laws of the coherence $\tilde{L}_n$ defined in (3) and (4). Note that
under Assumption (A), the columns $X_1, \dots, X_p$ are independent $n$-dimensional random
vectors with a common spherical distribution. By symmetry, it is easy to see that the mean
$\mu = EX_1 = 0$ if it exists and hence

$$\tilde{\rho}_{ij} = \frac{X_i^T X_j}{\|X_i\| \cdot \|X_j\|}, \qquad \tilde{L}_n = \max_{1 \le i < j \le p} |\tilde{\rho}_{ij}|. \tag{15}$$

As mentioned in the introduction, Cai and Jiang (2010) showed that the limiting laws
of $L_n$ and $\tilde{L}_n$ coincide with each other when the $x_{ij}$ are iid $N(0, 1)$ random variables and
$\log p = o(n^{1/3})$. We shall show that this is still true in our current setting for the
sub-exponential and exponential cases, but not true for the super-exponential case.

Theorem 4 (Sub-Exponential & Exponential Cases) Under the same conditions,
Theorems 1 and 2 and Corollaries 2.1 and 2.2 hold with $L_n$ replaced by $\tilde{L}_n$.

In the super-exponential case, the limiting behaviors of $L_n$ and $\tilde{L}_n$ are different.

Theorem 5 (Super-Exponential Case) Suppose $p = p_n$ satisfies $(\log p)/n \to \infty$ as
$n \to \infty$. Let $\tilde{T}_n = \log(1 - \tilde{L}_n^2)$. Then under Assumption (A),

(i). $\tilde{L}_n \to 1$ in probability as $n \to \infty$. Further, $\frac{n}{\log p} \tilde{T}_n \to -4$ in probability as $n \to \infty$.

(ii). As $n \to \infty$,

$$n\tilde{T}_n + \frac{4n}{n-1}\log p - \log n \tag{16}$$

converges weakly to the distribution function $F(y) = 1 - e^{-K e^{y/2}}$, $y \in \mathbb{R}$, with $K =
1/\sqrt{2\pi}$.

Note the difference between (13) and (16). When $(\log p)/n \to \infty$, the difference between
$\frac{4n}{n-2}\log p$ and $\frac{4n}{n-1}\log p$ is not negligible.

3 Applications 

As mentioned in the introduction, the limiting laws of the coherence have a wide range of
applications. Here we discuss briefly two immediate applications, one in high-dimensional
statistics and another in signal processing. These applications were also discussed in Cai
and Jiang (2010), but restricted to the Gaussian case with $\log p = o(n^{1/3})$. Here we extend
to the more general spherical distributions and higher dimensions.

Testing the covariance structure of a distribution is an important problem in high dimensional
statistical inference. Let $Y_1, \dots, Y_n$ be a random sample from a $p$-variate spherical
distribution with covariance matrix $\Sigma_{p \times p} = (\sigma_{ij})$. We wish to test the hypotheses that $\Sigma$
is diagonal, i.e.,

$$H_0: \sigma_{ij} = 0 \text{ for all } |i - j| \ge 1 \quad \text{vs.} \quad H_a: \sigma_{ij} \ne 0 \text{ for some } |i - j| \ge 1. \tag{17}$$

In the Gaussian case, this is the same as testing for independence. The asymptotic distribution
of $L_n$ can be used to construct a convenient test statistic for testing the hypotheses
in (17). For example, in the case $\log p = o(n^{1/2})$, an approximate level $\alpha$ test is to reject
the null hypothesis $H_0$ whenever

$$L_n \ge n^{-1/2}\big(4\log p - \log\log p - \log(8\pi) - 2\log\log(1 - \alpha)^{-1}\big)^{1/2}.$$

It follows directly from Theorem 1 that the size of this test goes to $\alpha$ asymptotically as
$n \to \infty$. This test was introduced in Cai and Jiang (2010) in the Gaussian case with the
restriction that $\log p = o(n^{1/3})$.
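The rejection rule for the case $\log p = o(n^{1/2})$ can be sketched as a function returning the critical value for the coherence. The function name and the explicit error for a negative radicand are choices made here for illustration; the formula itself is obtained by setting the limit law (5) equal to $1 - \alpha$ and solving for $L_n$.

```python
import math

def coherence_test_threshold(n, p, alpha):
    """Critical value c: reject H0 in (17) when the coherence L_n >= c.

    Uses the limiting law (5); intended for log p = o(n^{1/2}).
    """
    radicand = (4.0 * math.log(p) - math.log(math.log(p))
                - math.log(8.0 * math.pi)
                - 2.0 * math.log(math.log(1.0 / (1.0 - alpha))))
    if radicand < 0:
        raise ValueError("approximation not usable for these n, p, alpha")
    return math.sqrt(radicand / n)

# Consistency check: the threshold maps back to probability 1 - alpha under (5).
n_t, p_t, alpha_t = 500, 1000, 0.05
c_t = coherence_test_threshold(n_t, p_t, alpha_t)
y_t = n_t * c_t ** 2 - 4.0 * math.log(p_t) + math.log(math.log(p_t))
limit_cdf = math.exp(-math.exp(-y_t / 2.0) / math.sqrt(8.0 * math.pi))
```

A threshold at or above 1 would mean the test can never reject, which signals that $p$ is too large relative to $n$ for this regime.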

Similarly, in the exponential (and sub-exponential) case, set

$$D_{n,p} = nT_n + 4\log p - \log\log p.$$

Then Theorem 2 states that

$$P(D_{n,p} \le y) \to 1 - \exp\big(-K(\beta) e^{(y + 8\beta)/2}\big), \tag{18}$$

where $K(\beta) = \Big(\frac{\beta}{2\pi(1 - e^{-4\beta})}\Big)^{1/2}$. An approximate level $\alpha$ test for testing the hypotheses in
(17) can be obtained by rejecting the null hypothesis $H_0$ whenever

$$D_{n,p} \le 2\log\log(1 - \alpha)^{-1} - 2\log K(\beta) - 8\beta.$$

A test for the super-exponential case can also be constructed analogously by using the
limiting distribution given in Theorem 3.
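A sketch of the exponential-case test. The function names are illustrative, and plugging in `beta = log(p) / n` for the unknown limit $\beta$ in a finite sample is an assumption of this sketch, not a recommendation made in the text.

```python
import math

def K_of_beta(beta):
    """K(beta) from (12) and (18), for beta > 0."""
    return math.sqrt(beta / (2.0 * math.pi * (-math.expm1(-4.0 * beta))))

def exp_case_critical_value(alpha, beta):
    """Reject H0 in (17) when D_{n,p} <= this value (limit law (18))."""
    return (2.0 * math.log(math.log(1.0 / (1.0 - alpha)))
            - 2.0 * math.log(K_of_beta(beta)) - 8.0 * beta)

def D_statistic(L, n, p):
    """D_{n,p} = n*log(1 - L^2) + 4*log p - log log p for an observed coherence L."""
    return n * math.log1p(-L * L) + 4.0 * math.log(p) - math.log(math.log(p))

# Consistency check: the critical value has limiting probability alpha under (18).
alpha_e, beta_e = 0.05, 0.5
y_e = exp_case_critical_value(alpha_e, beta_e)
limit_prob = 1.0 - math.exp(-K_of_beta(beta_e) * math.exp((y_e + 8.0 * beta_e) / 2.0))
```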

Compressed sensing is an active and fast growing field in signal processing. See, e.g., 
Donoho (2006), Candes and Tao (2007), Bickel, Ritov and Tsybakov (2009), Candes and 
Plan (2009), and Cai, Wang and Xu (2010a, b). An important problem in compressed sensing
is the construction of measurement matrices $X_{n \times p}$ which enable the precise recovery of
a sparse signal $\beta$ from linear measurements $y = X\beta$ using an efficient recovery algorithm.
Such a measurement matrix $X$ is typically randomly generated because it is difficult to
construct deterministically. The best known example is perhaps the $n \times p$ random matrix
$X$ whose entries $x_{ij}$ are iid normal variables

$$x_{ij} \sim N(0, n^{-1}). \tag{19}$$

A commonly used condition is the mutual incoherence property (MIP) which requires the
pairwise correlations among the column vectors of $X$ to be small. Write $X = (X_1, \dots, X_p) =
(x_{ij})_{n \times p}$ with $x_{ij}$ satisfying (19) and let the coherence $\tilde{L}_n = \max_{1 \le i < j \le p} |\tilde{\rho}_{ij}|$ be defined as
in (3) and (4). It has been shown that the condition

$$(2k - 1)\tilde{L}_n < 1 \tag{20}$$

ensures the exact recovery of a $k$-sparse signal $\beta$ in the noiseless case where $y = X\beta$ (see
Donoho and Huo (2001) and Fuchs (2004)), and stable recovery of a sparse signal in the
noisy case where

$$y = X\beta + z.$$

Here $z$ is an error vector, not necessarily random. See Cai, Wang and Xu (2010b).

The limiting laws derived in this paper can be used to show how likely a random matrix
satisfies the MIP condition (20). Take the sub-exponential case as an example. By Theorem
4, as long as $(\log p)/n \to 0$,

$$\sqrt{\frac{n}{\log p}}\, \tilde{L}_n \to 2 \quad \text{in probability}.$$

So in order for the MIP condition (20) to hold, roughly the sparsity $k$ should satisfy

$$k < \frac{1}{4}\sqrt{\frac{n}{\log p}}.$$
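As a rough numerical illustration, substituting the sub-exponential approximation $\tilde{L}_n \approx 2\sqrt{(\log p)/n}$ into the MIP condition $(2k - 1)\tilde{L}_n < 1$ gives $k < \frac{1}{2} + \frac{1}{4}\sqrt{n/\log p}$, that is, roughly $k < \frac{1}{4}\sqrt{n/\log p}$. The helper names below are illustrative and the bound is a heuristic, not a guarantee for any particular matrix.

```python
import math

def mip_holds(k, coherence_value):
    """Check the MIP condition (2k - 1) * L < 1 for sparsity k and coherence L."""
    return (2 * k - 1) * coherence_value < 1.0

def approx_max_sparsity(n, p):
    """Heuristic sparsity bound (1/4) * sqrt(n / log p) for the sub-exponential case."""
    return 0.25 * math.sqrt(n / math.log(p))

# With n = 10_000 measurements and p = 100 columns the heuristic bound is about 11.6.
bound_demo = approx_max_sparsity(10_000, 100)
typical_L = 2.0 * math.sqrt(math.log(100) / 10_000)   # approximate coherence
ok_demo = mip_holds(int(bound_demo), typical_L)
```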



4 Technical Tool: Distribution of Correlation Coefficients 

In this section we shall discuss the methodology used in our technical arguments. Sophisti- 
cated approximation methods such as the Chen-Stein method, large deviation bounds and 
strong approximations are the main ingredients in the proofs of the previous results in the 
literature including those given in Jiang (2004), Li and Rosalsky (2006), Zhou (2007), Liu, 
Lin and Shao (2008), Li, Liu and Rosalsky (2009), Li, Qi and Rosalsky (2010), and Cai and 
Jiang (2010). Though these technical tools work well for the cases when the dimension $p$ is
not ultra high, it is far from clear to us whether/how these same tools can be used to derive
the limiting distributions of the coherence $L_n$ for the three regimes considered in Section 2.
In this paper, a different approach is developed to derive the limiting distributions of
$L_n$. Assuming the $X_j$'s have the spherical distribution, we find an interesting and useful
property of the correlation coefficients $\{\rho_{ij};\ 1 \le i < j \le p\}$ and $\{\tilde{\rho}_{ij};\ 1 \le i < j \le p\}$ given
below.

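Lemma 4.1 Let $n \ge 3$. Under Assumption (A), the Pearson correlation coefficients
$\{\rho_{ij};\ 1 \le i < j \le p\}$ are pairwise independent and identically distributed with density
function

$$f(\rho) = \frac{1}{\sqrt{\pi}} \frac{\Gamma\big(\frac{n-1}{2}\big)}{\Gamma\big(\frac{n-2}{2}\big)} \cdot (1 - \rho^2)^{\frac{n-4}{2}}, \quad |\rho| < 1. \tag{21}$$

Similarly, $\{\tilde{\rho}_{ij};\ 1 \le i < j \le p\}$ are pairwise independent and identically distributed with
density

$$g(\rho) = \frac{1}{\sqrt{\pi}} \frac{\Gamma\big(\frac{n}{2}\big)}{\Gamma\big(\frac{n-1}{2}\big)} \cdot (1 - \rho^2)^{\frac{n-3}{2}}, \quad |\rho| < 1. \tag{22}$$

Note that the only difference between (21) and (22) is the "degree of freedom": replacing
$n$ in (22) with $n - 1$, one gets (21). This is not difficult to understand by noting the definition
of $\rho_{ij} = \frac{(X_i - \bar{X}_i)^T (X_j - \bar{X}_j)}{\|X_i - \bar{X}_i\| \cdot \|X_j - \bar{X}_j\|}$, where $\bar{X}_i$ denotes the vector whose entries all equal the
sample mean of $X_i$. Heuristically, by subtracting $\bar{X}_i$ from $X_i$, the distribution of $\rho_{ij}$
becomes one degree less than that of $\tilde{\rho}_{ij} = \frac{X_i^T X_j}{\|X_i\| \cdot \|X_j\|}$.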

Although $\{\rho_{ij};\ 1 \le i < j \le p\}$ are pairwise independent, they are not mutually
independent. In fact, recalling $\Psi = \Psi_n = (\rho_{ij})_{p \times p}$, the probability density function of $\Psi$ is
given by

$$h(\Psi) = B_{n,p} \cdot (\det(\Psi))^{(n-p-2)/2} \tag{23}$$

for $1 < p < n$, where $B_{n,p}$ is an (explicit) normalizing constant; see p. 148 of Muirhead
(1982). Obviously, $h(\Psi)$ is not a product of functions of individual $\rho_{ij}$'s, the entries of $\Psi$,
hence $\{\rho_{ij};\ 1 \le i < j \le p\}$ are not independent.
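An elementary way to see the lack of mutual independence, different from the density argument via (23), is that every sample correlation matrix is positive semi-definite. For three columns this forces $1 + 2\rho_{12}\rho_{13}\rho_{23} - \rho_{12}^2 - \rho_{13}^2 - \rho_{23}^2 \ge 0$, so a triple such as $(0.9, 0.9, -0.9)$ can never occur even though each coordinate on its own can be close to any value in $(-1, 1)$. The sketch below checks this constraint on simulated Gaussian data; the function names, sample sizes and seed are arbitrary.

```python
import numpy as np

def corr_triple(n, rng):
    """Sample correlations (rho_12, rho_13, rho_23) of three Gaussian columns."""
    X = rng.standard_normal((n, 3))
    R = np.corrcoef(X, rowvar=False)
    return R[0, 1], R[0, 2], R[1, 2]

def corr_determinant(a, b, c):
    """Determinant of the 3 x 3 correlation matrix with off-diagonal entries a, b, c."""
    return 1.0 + 2.0 * a * b * c - a * a - b * b - c * c

rng_dep = np.random.default_rng(0)
dets = np.array([corr_determinant(*corr_triple(5, rng_dep)) for _ in range(2000)])
smallest = dets.min()                                # never below 0, up to rounding
forbidden = corr_determinant(0.9, 0.9, -0.9)         # negative: not a valid triple
```

If the three coefficients were mutually independent, their joint support would be the whole cube $(-1, 1)^3$, which the determinant constraint rules out.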

Lemma 4.1 also yields the following interesting results on the distribution of the
correlation coefficients $\rho_{ij}$ in the small sample cases. The verification is given in Section 6.

Corollary 4.1 Under Assumption (A), the following holds for all $1 \le i < j \le p$.

(i). When $n = 2$, $\rho_{ij}$ has the symmetric Bernoulli distribution, i.e., $P(\rho_{ij} = \pm 1) = 1/2$.

(ii). When $n = 3$, $\rho_{ij}$ has the density $f(\rho) = \frac{1}{\pi} \frac{1}{\sqrt{1 - \rho^2}}$ on $(-1, 1)$. That is, $\rho_{ij}^2$ follows the
arcsine law on $[0, 1]$.

(iii). When $n = 4$, $\rho_{ij}$ follows the uniform distribution on $[-1, 1]$.

(iv). When $n = 5$, $\rho_{ij}$ has the density $f(\rho) = \frac{2}{\pi}\sqrt{1 - \rho^2}$ for $|\rho| < 1$. That is, $\rho_{ij}$ follows the
semi-circle law.
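The small-sample cases can be checked by evaluating the density of Lemma 4.1, $f(\rho) = \frac{1}{\sqrt{\pi}}\frac{\Gamma((n-1)/2)}{\Gamma((n-2)/2)}(1 - \rho^2)^{(n-4)/2}$, directly. The function name is illustrative; log-gamma is used so that the same code also works for large $n$, and the case $n = 2$ is excluded because the law is then discrete.

```python
import math

def corr_density(rho, n):
    """Density of rho_ij from Lemma 4.1 at |rho| < 1, for n >= 3."""
    if n < 3:
        raise ValueError("the density exists only for n >= 3")
    log_const = (math.lgamma((n - 1) / 2) - math.lgamma((n - 2) / 2)
                 - 0.5 * math.log(math.pi))
    return math.exp(log_const) * (1.0 - rho * rho) ** ((n - 4) / 2)

rho0 = 0.3
arcsine_case = corr_density(rho0, 3)     # should equal 1 / (pi * sqrt(1 - rho0^2))
uniform_case = corr_density(rho0, 4)     # should equal 1/2
semicircle_case = corr_density(rho0, 5)  # should equal (2/pi) * sqrt(1 - rho0^2)
```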

Lemma 4.1 provides a major technical tool for the proof of the main results. The starting
step in the proofs of our theorems is the Chen-Stein method (Lemma 6.3) which requires
the evaluation of two quantities: $P(\rho_{ij} > C)$ and $P(\rho_{ij} > C, \rho_{kl} > C)$. By using the
explicit density expression in (21), we are able to evaluate the first probability precisely.
The pairwise independence stated in Lemma 4.1 yields $P(\rho_{ij} > C, \rho_{kl} > C) = P(\rho_{ij} > C)^2$
for $\{i, j\} \ne \{k, l\}$. In other words, the evaluation of the second quantity is reduced to the
study of the first one. This greatly simplifies some of the technical arguments.
Remark 4.1 Equation (21) yields directly that $W_n := \sqrt{n}\rho_{12}$ has the density function

$$f_n(w) = \frac{1}{\sqrt{n\pi}} \frac{\Gamma\big(\frac{n-1}{2}\big)}{\Gamma\big(\frac{n-2}{2}\big)} \Big(1 - \frac{w^2}{n}\Big)^{\frac{n-4}{2}} \to \frac{1}{\sqrt{2\pi}} e^{-w^2/2}$$

as $n \to \infty$ for all $w \in \mathbb{R}$, where the fact that $\Gamma(\frac{n-1}{2})/\Gamma(\frac{n-2}{2}) \sim \sqrt{n/2}$ as $n \to \infty$ (see
(33)) is used. This shows that $W_n$ converges to $N(0, 1)$ in distribution as $n \to \infty$. Set
$(x_{ij})_{n \times p} := (X_1, \dots, X_p)$. Assuming that the $x_{ij}$'s are i.i.d. with an unknown distribution
but with suitable moment conditions, say, $|x_{12}|$ is bounded, it can be shown easily that
$\sqrt{n}\rho_{12}$ converges to $N(0, 1)$ by using the standard central limit theorem for i.i.d. random
variables and the Slutsky theorem. However, the convergence speed is hard to capture
well enough for $L_n$ in (2) to be understood clearly when $p$ is much larger than $n$. The best
known result is that (5) holds for $\log p = o(n^\alpha)$ with $\alpha = 1/3$ in Cai and Jiang (2010). Here,
with the understanding of the pairwise independence among $\{\rho_{ij};\ 1 \le i < j \le p\}$ and the
exact distribution of $\rho_{ij}$ we are able to get the limiting distribution of $L_n$ for the full range
of the values of $p$ and to fully characterize the phase transition phenomena in the limiting
behaviors of the coherence (Theorems 1, 2 and 3 and the corresponding corollaries).
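The normal limit in Remark 4.1 can be checked numerically from the density (21): by a change of variables, $W_n = \sqrt{n}\rho_{12}$ has density $\frac{1}{\sqrt{n\pi}}\frac{\Gamma((n-1)/2)}{\Gamma((n-2)/2)}(1 - w^2/n)^{(n-4)/2}$ on $|w| < \sqrt{n}$. The sketch below evaluates it on the log scale to avoid overflow; the function names and the evaluation point are illustrative.

```python
import math

def w_density(w, n):
    """Density of W_n = sqrt(n) * rho_12 implied by (21); zero outside |w| < sqrt(n)."""
    if w * w >= n:
        return 0.0
    log_const = (math.lgamma((n - 1) / 2) - math.lgamma((n - 2) / 2)
                 - 0.5 * math.log(n * math.pi))
    return math.exp(log_const + ((n - 4) / 2) * math.log1p(-w * w / n))

def normal_density(w):
    """Standard normal density."""
    return math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)

# The gap to the normal density at w = 1 shrinks as n grows.
errors = [abs(w_density(1.0, n) - normal_density(1.0)) for n in (10, 100, 10_000, 10**6)]
```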

5 Discussions 

The present paper was inspired by the applications in high-dimensional statistics and signal
processing in which the dimension $p$ is often desired to be as high as possible as a function of
$n$. All the known results on the coherence $L_n$ are restricted to the cases where the dimension
$p$ is either linear, polynomial or at most sub-exponential in $n$. In comparison, we give in this
paper a complete characterization of the limiting distribution of $L_n$ for the full range of $p$
including the sub-exponential case $\frac{1}{n}\log p \to 0$, the exponential case $\frac{1}{n}\log p \to \beta \in (0, \infty)$,
and the super-exponential case $\frac{1}{n}\log p \to \infty$. Our results show interesting phase transition
phenomena in the limiting distributions of the coherence when the dimension $p$ grows as a
function of $n$. Over the full range of values of $p$, phase transition of the limiting behavior
of $L_n$ occurs twice: when $\log p$ is of order $n^{1/2}$ and when $\log p$ is of order $n$. These results
also show that the standard limiting distribution (5) known in the literature holds if and
only if $\log p = o(n^{1/2})$ when the columns have a spherical distribution which includes the
commonly considered i.i.d. normal setting as a special case.

Previous results on the coherence $L_n$ focus on the case where the entries $x_{ij}$ of the random
matrix $X$ are i.i.d. under certain moment conditions. See the references mentioned in
(a), (b) and (c) in the introduction. In this paper, we assume the columns of $X = (x_{ij})_{n \times p}$
to be i.i.d. with a spherical distribution. The spherical distribution assumption is more
special than the non-specified distributions with certain moment conditions considered in
the previous literature. On the other hand, the entries of a vector with a spherical distribution
do not have to be independent (see, e.g., the normal scale-mixture distribution in (6)
and the multivariate $t$-distribution in (7)). In this sense, our work relaxes the independence
assumption among the entries $x_{ij}$. Under the assumption of spherical distributions, we are
able to show that the sample correlation coefficients are pairwise independent and then use
the exact distribution and the pairwise-independence structure of the sample correlation
coefficients as a major technical tool in the derivation of the limiting distributions.

There are interesting connections between sample correlation coefficients and angles
between random vectors. Let $a \in \mathbb{R}^n$ be a deterministic vector with $\|a\| = 1$. Let $X_1 \in \mathbb{R}^n$
be a random vector with a spherical distribution satisfying $P(X_1 = 0) = 0$. Relating
Theorem 1.5.7(i) and (5) on page 147 in Muirhead (1982), it can be seen that $W = \frac{a^T X_1}{\|X_1\|}$ has
the same distribution as the one given in (22). Note that $\frac{X_1}{\|X_1\|}$ has the uniform distribution
over the unit sphere in $\mathbb{R}^n$, and hence $W$ is the cosine of the angle between a fixed unit
vector $a$ and a random vector with the uniform distribution on the unit sphere. Similar to
Corollary 4.1 the following holds.

(i). If $n = 2$, then the cosine of the angle has the probability density function $f(\rho) =
\frac{1}{\pi} \frac{1}{\sqrt{1 - \rho^2}}$. That is, the square of the cosine follows the arcsine law on $[0, 1]$.

(ii). If $n = 3$, then the cosine of the angle follows the uniform distribution on $[-1, 1]$.

(iii). If $n = 4$, then the cosine of the angle has the probability density function $f(\rho) =
\frac{2}{\pi}\sqrt{1 - \rho^2}$ for $|\rho| < 1$. That is, the cosine of the angle follows the semi-circle law.

The semi-circle law is perhaps best known in random matrix theory as the limit of the empirical distribution of the eigenvalues of an $n \times n$ Wigner random matrix as $n \to \infty$. See, e.g., Wigner (1958). It seems less common to meet a random variable that satisfies the semi-circle law in practice. It is therefore interesting to see the semi-circle law here as the exact distribution of the correlation coefficient and of the cosine of the angle between two random vectors, in Corollary 4.1(iv) and in (iii) above.
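The three special cases above can be checked numerically. The following is a minimal sketch, not part of the original argument. It assumes that the density in (22) has the form $\frac{1}{\sqrt{\pi}} \frac{\Gamma(n/2)}{\Gamma((n-1)/2)} (1 - \rho^2)^{(n-3)/2}$, which is the form consistent with cases (i) to (iii); under that form $W^2$ has mean $1/n$. It draws the uniform point on the sphere by normalizing a Gaussian vector. The function names are hypothetical.

```python
import math
import random


def cosine_with_fixed_axis(n, rng):
    """Cosine of the angle between a = e_1 and a uniform point on the unit sphere in R^n."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]      # a spherical (Gaussian) vector
    norm = math.sqrt(sum(v * v for v in g))
    return g[0] / norm                                # a^T X / ||X|| with a = e_1


def density_22(rho, n):
    """Assumed form of (22): Gamma(n/2) / (sqrt(pi) Gamma((n-1)/2)) * (1 - rho^2)^((n-3)/2)."""
    c = math.gamma(n / 2) / (math.sqrt(math.pi) * math.gamma((n - 1) / 2))
    return c * (1.0 - rho * rho) ** ((n - 3) / 2)


def second_moment(n, samples=50_000, seed=0):
    """Monte Carlo estimate of E[W^2]; the assumed density gives exactly 1/n."""
    rng = random.Random(seed)
    return sum(cosine_with_fixed_axis(n, rng) ** 2 for _ in range(samples)) / samples


for n in (2, 3, 4):
    print(n, "simulated E[W^2]:", second_moment(n), "theory:", 1.0 / n)

# Constants of the three special cases: 1/pi (arcsine type), 1/2 (uniform), 2/pi (semi-circle).
print(density_22(0.0, 2), density_22(0.0, 3), density_22(0.0, 4))
```

Matching a single moment is only a sanity check. A fuller comparison would bin the simulated cosines and compare the histogram with `density_22`.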

6 Proofs 

In this section we prove the main results of the paper. We shall write $p$ for $p_n$ if there is no confusion. We begin by proving Lemma 4.1 on the distributions of the correlation coefficients. We then collect and prove a few additional technical results before giving the proofs of the main theorems.

6.1 Technical Results 

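The following lemma is needed for the proof of Lemma 4.1.

Lemma 6.1 Let $\mathbf{X}$ be an $n$-dimensional random vector with a spherical distribution and $P(\mathbf{X} = \mathbf{0}) = 0$. Let $\mathbf{1} = (1, \dots, 1)^T \in \mathbb{R}^n$ and $\{\mathbf{1}\} = \{k\mathbf{1};\ k \in \mathbb{R}\}$, the span of $\mathbf{1}$. Then $P(\mathbf{X} \in \{\mathbf{1}\}) = 0$.

Proof. Since $P(\mathbf{X} = \mathbf{0}) = 0$, we know $\mathbf{Y} := \mathbf{X} / \|\mathbf{X}\|$ is well-defined. By definition, $O\mathbf{X}$ and $\mathbf{X}$ have the same distribution for any orthogonal matrix $O$, and then

$$O\mathbf{Y} = \frac{O\mathbf{X}}{\|O\mathbf{X}\|} \stackrel{d}{=} \frac{\mathbf{X}}{\|\mathbf{X}\|} = \mathbf{Y}.$$

That is, the probability measure generated by $\mathbf{Y}$ is an orthogonal-invariant measure on the unit sphere $S^{n-1} \subset \mathbb{R}^n$. Since the Haar probability measure, as the distribution on the unit sphere with the orthogonal-invariant property, is unique, it follows that $\mathbf{Y}$ must have the uniform distribution on the unit sphere in $\mathbb{R}^n$. In particular, $P(\mathbf{Y} = \mathbf{y}) = 0$ for any $\mathbf{y} \in S^{n-1}$. Let $A = \{\mathbf{X} \in \{\mathbf{1}\} \setminus \{\mathbf{0}\}\}$ and $\mathbf{y}_0 = n^{-1/2} (1, \dots, 1)^T \in S^{n-1}$. Notice $A \subset \{\mathbf{Y} = \mathbf{y}_0 \text{ or } -\mathbf{y}_0\}$. It follows that $P(\mathbf{X} \in \{\mathbf{1}\}) = P(A) \le P(\mathbf{Y} = \mathbf{y}_0) + P(\mathbf{Y} = -\mathbf{y}_0) = 0$. ■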

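Proof of Lemma 4.1. Recall that $\mathbf{X}_1, \dots, \mathbf{X}_p$ are independent and $\rho_{ij}$ is the Pearson correlation coefficient of $\mathbf{X}_i$ and $\mathbf{X}_j$ for $1 \le i < j \le p$. Take $i < j$ and $k < l$ with $(i, j) \ne (k, l)$. It is easy to see that $\rho_{ij}$ and $\rho_{kl}$ are independent if $\{i, j\} \cap \{k, l\} = \emptyset$. Thus, to finish the proof, it is enough to prove the following:

    Let $\{\mathbf{U}, \mathbf{V}, \mathbf{W}\}$ be i.i.d. with an $n$-dimensional spherical distribution and $P(\mathbf{U} = \mathbf{0}) = 0$. Then $\rho_{\mathbf{U},\mathbf{V}}$ and $\rho_{\mathbf{U},\mathbf{W}}$ are i.i.d. with the density function given in (21).    (24)

By Lemma 6.1, $P(\mathbf{U} \in \{\mathbf{1}\}) = P(\mathbf{V} \in \{\mathbf{1}\}) = P(\mathbf{W} \in \{\mathbf{1}\}) = 0$. Then $\rho_{\mathbf{U},\mathbf{V}}$ and $\rho_{\mathbf{U},\mathbf{W}}$ have the same probability density function $f(\rho)$ by (5) on p. 147 from Muirhead (1982). To show the independence, we need to prove

$$E[g(\rho_{\mathbf{U},\mathbf{V}}) \cdot h(\rho_{\mathbf{U},\mathbf{W}})] = E g(\rho_{\mathbf{U},\mathbf{V}}) \cdot E h(\rho_{\mathbf{U},\mathbf{W}}) \tag{25}$$

for any bounded and measurable functions $g(x)$ and $h(x)$. Since $\mathbf{U}$, $\mathbf{V}$ and $\mathbf{W}$ are independent,

$$E[g(\rho_{\mathbf{U},\mathbf{V}}) \cdot h(\rho_{\mathbf{U},\mathbf{W}})] = E\{E[g(\rho_{\mathbf{U},\mathbf{V}}) \cdot h(\rho_{\mathbf{U},\mathbf{W}}) \mid \mathbf{U}]\} = E\{E[g(\rho_{\mathbf{U},\mathbf{V}}) \mid \mathbf{U}] \cdot E[h(\rho_{\mathbf{U},\mathbf{W}}) \mid \mathbf{U}]\}. \tag{26}$$

Write $\mathbf{V} = (V_1, \dots, V_n)^T \in \mathbb{R}^n$ and $\bar{V} = \frac{1}{n} \sum_{i=1}^n V_i$. For any numbers $u_1, \dots, u_n$ such that at least two of them are not identical, Theorem 5.1.1 and (5) on p. 147 from Muirhead (1982) say that

$$\rho_{\mathbf{u},\mathbf{V}} = \frac{\sum_{i=1}^n (u_i - \bar{u})(V_i - \bar{V})}{\sqrt{\sum_{i=1}^n (u_i - \bar{u})^2} \cdot \sqrt{\sum_{i=1}^n (V_i - \bar{V})^2}}$$

has the probability density function $f(\rho)$ as in (21), where $\mathbf{u} = (u_1, \dots, u_n)^T$ and $\bar{u} = \frac{1}{n} \sum_{i=1}^n u_i$ (see also Kariya and Eaton (1977) for this). In other words, given $\mathbf{U}$, the probability distribution of $\rho_{\mathbf{U},\mathbf{V}}$ does not depend on the value of $\mathbf{U}$. Let $\mathbf{U} = (U_1, \dots, U_n)^T$. Evidently, $P(U_1 = \dots = U_n) = P(\mathbf{U} \in \{\mathbf{1}\}) = 0$. Thus,

$$E[g(\rho_{\mathbf{U},\mathbf{V}}) \mid \mathbf{U}] = \int_{|\rho| < 1} g(\rho) f(\rho)\, d\rho = E g(\rho_{\mathbf{U},\mathbf{V}})$$

and

$$E[h(\rho_{\mathbf{U},\mathbf{W}}) \mid \mathbf{U}] = E h(\rho_{\mathbf{U},\mathbf{W}})$$

since $\rho_{\mathbf{U},\mathbf{V}}$ and $\rho_{\mathbf{U},\mathbf{W}}$ have the same probability density function $f(\rho)$ as in (21). These and (26) conclude (25).

We now turn to study $\tilde{\rho}_{ij}$. Take $1 \le i < j \le p$. Then $\mathbf{a} := \mathbf{X}_i / \|\mathbf{X}_i\|$ is a unit vector and is independent of $\mathbf{X}_j$. Further, $\tilde{\rho}_{ij} = \mathbf{a}^T \mathbf{X}_j / \|\mathbf{X}_j\|$. It then follows from Theorem 1.5.7(i) and the argument for (5) on p. 147 of Muirhead (1982) that $\tilde{\rho}_{ij}$ has the probability density function $f(\rho)$ as in (22). The proof for the pairwise independence among $\{\tilde{\rho}_{ij};\ 1 \le i < j \le p\}$ is the same as that for the $\rho_{ij}$'s. ■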
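The statement (24) lends itself to a quick Monte Carlo sanity check. The sketch below is illustrative only and rests on stated assumptions: Gaussian vectors stand in for the spherical distribution (the Pearson correlation is scale invariant, so a scale mixture would give the same values), and the density in (21) is taken to be proportional to $(1 - \rho^2)^{(n-4)/2}$, which gives $E\rho^2 = 1/(n-1)$. A vanishing sample correlation between $\rho_{\mathbf{U},\mathbf{V}}^2$ and $\rho_{\mathbf{U},\mathbf{W}}^2$ is a necessary consequence of independence, not a proof of it. The helper names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=5, reps=20_000, seed=0):
    """Draw (U, V, W) repeatedly and return the squared correlations that share the vector U."""
    rng = random.Random(seed)
    sq_uv, sq_uw = [], []
    for _ in range(reps):
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        w = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sq_uv.append(pearson(u, v) ** 2)
        sq_uw.append(pearson(u, w) ** 2)
    return sq_uv, sq_uw


sq_uv, sq_uw = simulate()
print("mean of rho_UV^2:", sum(sq_uv) / len(sq_uv), "theory 1/(n-1):", 1 / 4)
print("sample correlation of the two squares:", pearson(sq_uv, sq_uw))
```

For $n = 5$ the first line should be close to $1/4$, the variance of the semi-circle law on $[-1, 1]$, and the second close to zero.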

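Proof of Corollary 4.1. Taking $n = 3, 4, 5$, respectively, in Lemma 4.1, we easily have (ii), (iii) and (iv). Now we check (i).

Let $\mathbf{X}_1 = (\xi_1, \eta_1)^T \in \mathbb{R}^2$ and $\mathbf{X}_2 = (\xi_2, \eta_2)^T \in \mathbb{R}^2$. It is easy to see

$$\rho_{12} = \frac{\xi_1 - \eta_1}{|\xi_1 - \eta_1|} \cdot \frac{\xi_2 - \eta_2}{|\xi_2 - \eta_2|}. \tag{27}$$

First, Assumption (A) and Lemma 6.1 imply $P(\xi_i = \eta_i) = 0$ for $i = 1, 2$. Since $\mathbf{X}_1$ has a spherical distribution, we know that $A\mathbf{X}_1$ and $\mathbf{X}_1$ have the same distribution for any $A = \mathrm{diag}(\epsilon_1, \epsilon_2)$ with $\epsilon_i = \pm 1$, $i = 1, 2$. This implies $\mathbf{X}_1$ is symmetric, and hence $\xi_1 - \eta_1$ is symmetric. Consequently, $\frac{\xi_1 - \eta_1}{|\xi_1 - \eta_1|}$ takes the values $\pm 1$ with probability $1/2$ each. The same is true for $\frac{\xi_2 - \eta_2}{|\xi_2 - \eta_2|}$. By (27) and the independence between $\mathbf{X}_1$ and $\mathbf{X}_2$, we conclude that $P(\rho_{12} = \pm 1) = 1/2$. ■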
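The identity (27) for $n = 2$ can be confirmed directly. This is a small sketch under the assumption that the two columns are standard Gaussian vectors in $\mathbb{R}^2$, one convenient spherical distribution. It checks that the Pearson correlation equals the product of signs in (27) and that the value $+1$ occurs about half the time. The function names are hypothetical.

```python
import random


def pearson2(x, y):
    """Pearson correlation of two vectors in R^2, written out from the definition."""
    mx, my = (x[0] + x[1]) / 2, (y[0] + y[1]) / 2
    num = (x[0] - mx) * (y[0] - my) + (x[1] - mx) * (y[1] - my)
    sxx = (x[0] - mx) ** 2 + (x[1] - mx) ** 2
    syy = (y[0] - my) ** 2 + (y[1] - my) ** 2
    return num / (sxx * syy) ** 0.5


def sign(v):
    return 1.0 if v > 0 else -1.0


def fraction_plus_one(reps=10_000, seed=0):
    """Check (27) on every draw and return the fraction of draws with rho_12 = +1."""
    rng = random.Random(seed)
    plus = 0
    for _ in range(reps):
        x = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        y = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        rho = pearson2(x, y)
        rhs = sign(x[0] - x[1]) * sign(y[0] - y[1])   # right-hand side of (27)
        assert abs(rho - rhs) < 1e-9
        plus += rho > 0
    return plus / reps


print("fraction of draws with rho_12 = +1:", fraction_plus_one())
```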

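Lemma 6.2 Let $t = t_m \in (0, 1)$ satisfy $m t_m^2 \to \infty$ as $m \to \infty$. Then

$$\int_t^1 (1 - x^2)^{m/2}\, dx = \frac{1}{mt} (1 - t^2)^{(m+2)/2} (1 + o(1))$$

as $m \to \infty$.

Proof. Set $y = x^2$ for $x > 0$. Then $x = \sqrt{y}$ and

$$I_m := \int_t^1 (1 - x^2)^{m/2}\, dx = \frac{1}{2} \int_{t^2}^1 \frac{1}{\sqrt{y}} (1 - y)^{m/2}\, dy = -\frac{1}{m+2} \int_{t^2}^1 \frac{1}{\sqrt{y}} \left( (1 - y)^{(m+2)/2} \right)'\, dy. \tag{28}$$

By integration by parts,

$$I_m = -\frac{1}{m+2} \cdot \frac{1}{\sqrt{y}} (1 - y)^{(m+2)/2} \Big|_{t^2}^{1} - \frac{1}{2(m+2)} \int_{t^2}^1 \frac{1}{y^{3/2}} (1 - y)^{(m+2)/2}\, dy$$

$$\phantom{I_m} = \frac{1}{(m+2)t} (1 - t^2)^{(m+2)/2} - \frac{1}{m+2} \cdot \frac{1}{2} \int_{t^2}^1 \frac{1}{\sqrt{y}} (1 - y)^{m/2} \cdot \frac{1 - y}{y}\, dy. \tag{29}$$

Note that $0 \le \frac{1 - y}{y} < \frac{1}{t^2}$ for all $y \in [t^2, 1]$. By the identity $I_m = \frac{1}{2} \int_{t^2}^1 \frac{1}{\sqrt{y}} (1 - y)^{m/2}\, dy$ in (28),

$$0 < \frac{1}{m+2} \cdot \frac{1}{2} \int_{t^2}^1 \frac{1}{\sqrt{y}} (1 - y)^{m/2} \cdot \frac{1 - y}{y}\, dy < \frac{1}{m t^2} I_m.$$

This and (29) conclude that

$$\frac{1}{(m+2)t} (1 - t^2)^{(m+2)/2} - \frac{1}{m t^2} I_m < I_m < \frac{1}{(m+2)t} (1 - t^2)^{(m+2)/2}.$$

Solving the first inequality for $I_m$, we have

$$\frac{1}{(m+2)t} (1 - t^2)^{(m+2)/2} \left( 1 + \frac{1}{m t^2} \right)^{-1} < I_m < \frac{1}{(m+2)t} (1 - t^2)^{(m+2)/2}.$$

By the given condition that $m t^2 = m t_m^2 \to \infty$, we arrive at

$$I_m \sim \frac{1}{mt} (1 - t^2)^{(m+2)/2}$$

as $m \to \infty$. ■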
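The asymptotic relation in Lemma 6.2 can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it approximates $I_m$ by composite Simpson quadrature and compares it with the leading term $\frac{1}{mt}(1 - t^2)^{(m+2)/2}$. The values of $m$ and $t$ and the function names are arbitrary choices made here. By the two-sided bound in the proof, the ratio should lie below $1$ and approach $1$ as $m t^2$ grows.

```python
import math


def tail_integral(m, t, steps=20_000):
    """Composite Simpson approximation of int_t^1 (1 - x^2)^(m/2) dx; steps must be even."""
    h = (1.0 - t) / steps

    def f(x):
        return max(0.0, 1.0 - x * x) ** (m / 2)

    total = f(t) + f(1.0)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(t + k * h)
    return total * h / 3


def leading_term(m, t):
    """Right-hand side of Lemma 6.2 without the (1 + o(1)) factor."""
    return (1.0 - t * t) ** ((m + 2) / 2) / (m * t)


for m, t in [(2000, 0.1), (2000, 0.2), (8000, 0.2)]:
    ratio = tail_integral(m, t) / leading_term(m, t)
    print("m =", m, "t =", t, "m t^2 =", m * t * t, "ratio =", ratio)
```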



The following Poisson approximation result is essentially a special case of Theorem 1 
from Arratia et al. (1989). 

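Lemma 6.3 Let $I$ be an index set and $\{B_\alpha, \alpha \in I\}$ be a set of subsets of $I$, that is, $B_\alpha \subset I$ for each $\alpha \in I$. Let also $\{\eta_\alpha, \alpha \in I\}$ be random variables. For a given $t \in \mathbb{R}$, set $\lambda = \sum_{\alpha \in I} P(\eta_\alpha > t)$. Then

$$\left| P\left( \max_{\alpha \in I} \eta_\alpha \le t \right) - e^{-\lambda} \right| \le (1 \wedge \lambda^{-1})(b_1 + b_2 + b_3)$$

where

$$b_1 = \sum_{\alpha \in I} \sum_{\beta \in B_\alpha} P(\eta_\alpha > t) P(\eta_\beta > t), \qquad b_2 = \sum_{\alpha \in I} \sum_{\alpha \ne \beta \in B_\alpha} P(\eta_\alpha > t, \eta_\beta > t),$$

$$b_3 = \sum_{\alpha \in I} E \left| P(\eta_\alpha > t \mid \sigma(\eta_\beta, \beta \notin B_\alpha)) - P(\eta_\alpha > t) \right|,$$

and $\sigma(\eta_\beta, \beta \notin B_\alpha)$ is the $\sigma$-algebra generated by $\{\eta_\beta, \beta \notin B_\alpha\}$. In particular, if $\eta_\alpha$ is independent of $\{\eta_\beta, \beta \notin B_\alpha\}$ for each $\alpha$, then $b_3 = 0$.

Lemma 6.4 Let $L_n = \max_{1 \le i < j \le p} |\rho_{ij}|$ be the coherence defined earlier and let Assumption (A) hold. For $\{t_n \in [0, 1];\ n \ge 1\}$, set

$$h_n = \frac{n^{1/2} p^2}{\sqrt{2\pi}} \int_{t_n}^1 (1 - x^2)^{\frac{n-4}{2}}\, dx, \quad n \ge 1.$$

If $\lim_{n \to \infty} h_n = \lambda \in [0, \infty)$, then $\lim_{n \to \infty} P(L_n \le t_n) = e^{-\lambda}$.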

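Proof. For brevity of notation, we sometimes write $t = t_n$ if there is no confusion. First, take $I = \{(i, j);\ 1 \le i < j \le p\}$. For $u = (i, j) \in I$, set $B_u = \{(k, l) \in I;\ \text{one of } k \text{ and } l = i \text{ or } j, \text{ but } (k, l) \ne u\}$, $\eta_u = |\rho_{ij}|$ and $A_u = A_{ij} = \{|\rho_{ij}| > t\}$. By the i.i.d. assumption on $\mathbf{X}_1, \dots, \mathbf{X}_p$ and Lemma 6.3,

$$|P(L_n \le t) - e^{-\lambda_n}| \le b_{1,n} + b_{2,n} \tag{30}$$

where

$$\lambda_n = \frac{p(p-1)}{2} P(A_{12}) \tag{31}$$

and

$$b_{1,n} \le 2p^3 P(A_{12})^2 \quad \text{and} \quad b_{2,n} \le 2p^3 P(A_{12} A_{13}).$$

By Lemma 4.1, $A_{12}$ and $A_{13}$ are independent events with the same probability. Thus, from (31),

$$b_{1,n} \vee b_{2,n} \le 2p^3 P(A_{12})^2 \le \frac{8p \lambda_n^2}{(p-1)^2} \le \frac{32 \lambda_n^2}{p} \tag{32}$$

for all $p \ge 2$. Now we compute $P(A_{12})$. In fact, by Lemma 4.1 again,

$$P(A_{12}) = \int_{1 > |x| > t} f(x)\, dx = \frac{1}{\sqrt{\pi}} \frac{\Gamma(\frac{n-1}{2})}{\Gamma(\frac{n-2}{2})} \int_{1 > |x| > t} (1 - x^2)^{\frac{n-4}{2}}\, dx = \frac{2}{\sqrt{\pi}} \frac{\Gamma(\frac{n-1}{2})}{\Gamma(\frac{n-2}{2})} \int_t^1 (1 - x^2)^{\frac{n-4}{2}}\, dx.$$

Recalling the Stirling formula (see, e.g., p. 368 from Gamelin (2001) or (37) on p. 204 from Ahlfors (1979)):

$$\log \Gamma(z) = z \log z - z - \frac{1}{2} \log z + \log \sqrt{2\pi} + O\left( \frac{1}{x} \right)$$

as $x = \mathrm{Re}(z) \to \infty$, it is easy to verify that

$$\frac{\Gamma(\frac{n-1}{2})}{\Gamma(\frac{n-2}{2})} \sim \sqrt{\frac{n}{2}}$$

as $n \to \infty$. Thus,

$$P(A_{12}) \sim \sqrt{\frac{2n}{\pi}} \int_t^1 (1 - x^2)^{\frac{n-4}{2}}\, dx$$

as $n \to \infty$. From (31), we know

$$\lambda_n \sim \frac{p^2}{2} \sqrt{\frac{2n}{\pi}} \int_t^1 (1 - x^2)^{\frac{n-4}{2}}\, dx = \frac{n^{1/2} p^2}{\sqrt{2\pi}} \int_t^1 (1 - x^2)^{\frac{n-4}{2}}\, dx = h_n$$

as $n \to \infty$. Finally, by (30) and (32), we know

$$\lim_{n \to \infty} P(L_n \le t) = e^{-\lambda} \quad \text{if} \quad \lim_{n \to \infty} h_n = \lambda \in [0, \infty). \ ■$$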
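The Gamma-function step in the proof can be checked with the standard library. The sketch below is only a numerical illustration of the ratio $\Gamma(\frac{n-1}{2}) / \Gamma(\frac{n-2}{2}) \sim \sqrt{n/2}$ obtained from the Stirling formula. It works through `math.lgamma` so that large $n$ does not overflow. The function name is hypothetical.

```python
import math


def gamma_ratio(n):
    """Gamma((n-1)/2) / Gamma((n-2)/2), computed stably through the log-gamma function."""
    return math.exp(math.lgamma((n - 1) / 2) - math.lgamma((n - 2) / 2))


for n in (10, 100, 10_000, 1_000_000):
    # The quotient below should approach 1 as n grows.
    print(n, gamma_ratio(n) / math.sqrt(n / 2))
```

The quotient approaches $1$ from below with an error of order $1/n$, consistent with the $O(1/x)$ remainder in the Stirling formula.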

6.2 Proofs for Results on $L_n$ in Section 2.1

Proof of Theorem 1. (i). Assume (ii) of the theorem holds. Since $(\log p)/n \to 0$ as $n \to \infty$, dividing (8) by $n$, we see that $\log(1 - L_n^2) \to 0$ in probability, or equivalently, $L_n \to 0$ in probability as $n \to \infty$.

(ii). The proof here does not rely on the conclusion in (i). We claim that

$$(n - 2) \log(1 - L_n^2) + 4 \log p - \log \log p \tag{34}$$

converges weakly to the distribution function $F(y) = 1 - e^{-K e^{y/2}}$, $y \in \mathbb{R}$. Once this holds, using the condition that $\log p = o(n)$ and the same argument as in (i) above, we have $\log(1 - L_n^2) \to 0$ in probability as $n \to \infty$. Then by the Slutsky lemma,

$$n \log(1 - L_n^2) + 4 \log p - \log \log p$$

converges weakly to the distribution function $F(y) = 1 - e^{-K e^{y/2}}$, $y \in \mathbb{R}$. We then obtain (ii). Now we prove the claim about (34).

Fix $y \in \mathbb{R}$. Let $N = n - 2$ and $t = t_n \in [0, 1)$ be such that

$$\log(1 - t^2) = \frac{-4 \log p + \log \log p + y}{N} \wedge 0. \tag{35}$$

From (35) and the assumption $\log p = o(n)$, we have that $t_n \to 0$ as $n \to \infty$, and hence $\log(1 - t^2) \sim -t^2$. Thus, (35) implies

$$t^2 \sim \frac{4 \log p - \log \log p - y}{N} \sim \frac{4 \log p}{N} \quad \text{and} \quad N t^2 \to \infty \tag{36}$$

as $n \to \infty$. By (35) again,

$$P((n - 2) \log(1 - L_n^2) + 4 \log p - \log \log p \ge y) = P(L_n \le t) \tag{37}$$

when $n$ is large enough. Now let us compute $h_n$ in Lemma 6.4 to obtain $\lim_{n \to \infty} P(L_n \le t)$. Recall

$$h_n = \frac{n^{1/2} p^2}{\sqrt{2\pi}} \int_t^1 (1 - x^2)^{\frac{n-4}{2}}\, dx. \tag{38}$$

From Lemma 6.2 and the second assertion in (36),

$$\int_t^1 (1 - x^2)^{\frac{n-4}{2}}\, dx \sim \frac{1}{Nt} (1 - t^2)^{N/2}$$

as $n \to \infty$. This together with (35) and the first assertion in (36) gives

$$p^2 (1 - t^2)^{N/2} = p^2 \exp\left\{ \frac{-4 \log p + \log \log p + y}{N} \cdot \frac{N}{2} \right\} = \sqrt{\log p}\; e^{y/2} \quad \text{and} \quad Nt \sim 2 \sqrt{N \log p}$$

as $n \to \infty$. Combining the above three identities, we see that

$$h_n \sim \frac{n^{1/2}}{\sqrt{2\pi}} \cdot \frac{\sqrt{\log p}\; e^{y/2}}{2 \sqrt{N \log p}} \to \frac{1}{\sqrt{8\pi}} e^{y/2}$$

as $n \to \infty$. Therefore, we conclude from Lemma 6.4 and (37) that

$$\lim_{n \to \infty} P((n - 2) \log(1 - L_n^2) + 4 \log p - \log \log p \ge y) = e^{-K e^{y/2}}$$

for any $y \in \mathbb{R}$, where $K = \frac{1}{\sqrt{8\pi}}$. Since $\varphi(y) := e^{-K e^{y/2}}$ is continuous for all $y \in \mathbb{R}$, it is trivial to check that

$$\lim_{n \to \infty} P((n - 2) \log(1 - L_n^2) + 4 \log p - \log \log p \le y) = 1 - e^{-K e^{y/2}} \tag{39}$$

for any $y \in \mathbb{R}$. This is the claim about (34). ■
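The convergence $h_n \to \frac{1}{\sqrt{8\pi}} e^{y/2}$ used in the proof above can be observed numerically. The following is a rough sketch under several assumptions made here: the values $n = 10^6$ and $p = 10^6$ are arbitrary, the threshold $t$ is taken from (35), and the integral in (38) is approximated by Simpson quadrature on a short window to the right of $t$, where essentially all of the mass lies when $t$ is small. It works with $\log p$ so that $p^2$ never has to be formed separately. The names are hypothetical.

```python
import math

K_THEOREM_1 = 1.0 / math.sqrt(8.0 * math.pi)


def h_n(n, log_p, y, steps=4000):
    """Approximate h_n of Lemma 6.4 with t chosen as in (35); intended for small t only."""
    N = n - 2
    log_one_minus_t2 = min((-4.0 * log_p + math.log(log_p) + y) / N, 0.0)
    t = math.sqrt(-math.expm1(log_one_minus_t2))      # t^2 = 1 - exp(log(1 - t^2))
    if t <= 0.0:
        raise ValueError("threshold t must be positive for this sketch")
    # The integrand decays roughly like exp(-n t (x - t)); 40 decay lengths are ample.
    upper = min(t + 40.0 / (n * t), 1.0 - 1e-12)
    step = (upper - t) / steps

    def scaled_integrand(x):
        # p^2 * (1 - x^2)^((n - 4) / 2), evaluated on the log scale
        return math.exp(2.0 * log_p + 0.5 * (n - 4) * math.log1p(-x * x))

    total = scaled_integrand(t) + scaled_integrand(upper)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * scaled_integrand(t + k * step)
    return math.sqrt(n / (2.0 * math.pi)) * total * step / 3.0


n, log_p = 10**6, math.log(10**6)
for y in (-2.0, 0.0, 2.0):
    print(y, "h_n =", h_n(n, log_p, y), "limit =", K_THEOREM_1 * math.exp(y / 2))
```

The agreement is only approximate at finite $n$ and $p$, since the relative error terms in the proof are of order $1/\log p$.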

Proof of Corollary 2.1. Dividing (8) by $\log p$, we see that

$$\frac{n}{\log p} \log(1 - L_n^2) \to -4 \tag{40}$$

in probability as $n \to \infty$. By (i) of Theorem 1, we know $L_n \to 0$ in probability as $n \to \infty$. Since $\rho_{ij}$ has density $f(\rho)$ as in (21) for $i \ne j$, we have $P(L_n = 0) = 0$ for all $n \ge 3$. Notice that the function

$$h(x) := \begin{cases} \dfrac{\log(1 - x)}{x}, & \text{if } x \in (0, 1); \\ -1, & \text{if } x = 0 \end{cases}$$

is continuous on $[0, 1)$, so we have

$$\frac{\log(1 - L_n^2)}{L_n^2} = h(L_n^2) \to h(0) = -1$$

in probability as $n \to \infty$. This together with (40) yields

$$\frac{n}{\log p} L_n^2 \to 4$$

in probability as $n \to \infty$. The desired conclusion then follows. ■



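Proof of Corollary 2.2. By Theorem 1,

$$P(n \log(1 - L_n^2) + 4 \log p - \log \log p \le y) \to F(y) \tag{41}$$

as $n \to \infty$, where $F(y) = 1 - e^{-K e^{y/2}}$, $y \in \mathbb{R}$. Set

$$y_{n,p} = n \left[ 1 - \exp\left\{ \frac{1}{n} (-4 \log p + \log \log p + y) \right\} \right]. \tag{42}$$

Then (41) becomes $P(n L_n^2 \ge y_{n,p}) \to F(y)$, hence

$$P(n L_n^2 - 4 \log p + \log \log p < y_{n,p} - 4 \log p + \log \log p) \to 1 - F(y) \tag{43}$$

as $n \to \infty$ for any $y \in \mathbb{R}$. We claim

$$y_{n,p} - 4 \log p + \log \log p \to -(y + 8\alpha^2) \quad \text{if} \quad \frac{\log p}{\sqrt{n}} \to \alpha \in [0, \infty). \tag{44}$$

If this is true, by (43) and the continuity of $F(y)$,

$$\lim_{n \to \infty} P(n L_n^2 - 4 \log p + \log \log p < -(y + 8\alpha^2)) = 1 - F(y)$$

for any $y \in \mathbb{R}$. In other words, $n L_n^2 - 4 \log p + \log \log p$ converges weakly to a probability distribution function

$$G(z) := 1 - F(-z - 8\alpha^2) = \exp\left\{ -K e^{-(z + 8\alpha^2)/2} \right\}, \quad z \in \mathbb{R},$$

as $n \to \infty$. Now we prove claim (44).

In fact, set $t = -4 \log p + \log \log p + y$. Then $t = O(\log p)$ and $\frac{t}{n} \to 0$ as $n \to \infty$ under the assumption $\frac{\log p}{\sqrt{n}} \to \alpha$. Consequently, by (42) and the Taylor expansion,

$$y_{n,p} = n (1 - e^{t/n}) = -n \left[ \frac{t}{n} + \frac{t^2}{2n^2} + O\left( \frac{t^3}{n^3} \right) \right] = -t - \frac{t^2}{2n} + O\left( \frac{t^3}{n^2} \right)$$

as $n \to \infty$. If $\frac{\log p}{\sqrt{n}} \to \alpha$ as $n \to \infty$, then $\frac{t^2}{2n} \to 8\alpha^2$ and $\frac{t^3}{n^2} \to 0$ as $n \to \infty$. Therefore, (44) is concluded. ■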
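The limit in claim (44) is easy to watch numerically. The sketch below is an illustration only: it fixes hypothetical values $\alpha = 1$ and $y = 0.5$, sets $\log p = \alpha \sqrt{n}$ exactly, and evaluates $y_{n,p} - 4 \log p + \log \log p$ from (42) for growing $n$. It uses `math.expm1` so that $1 - e^{t/n}$ keeps its precision when $t/n$ is tiny, and it passes $\log p$ rather than $p$ to avoid overflow.

```python
import math


def centred_threshold(n, log_p, y):
    """y_{n,p} - 4 log p + log log p, with y_{n,p} = n (1 - exp(t / n)) as in (42)."""
    t = -4.0 * log_p + math.log(log_p) + y
    y_np = -n * math.expm1(t / n)
    return y_np - 4.0 * log_p + math.log(log_p)


alpha, y = 1.0, 0.5
for n in (1e4, 1e6, 1e8, 1e10):
    log_p = alpha * math.sqrt(n)
    print(n, centred_threshold(n, log_p, y), "limit:", -(y + 8.0 * alpha**2))
```

The convergence is slow because $t^2/(2n)$ differs from $8 (\log p)^2 / n$ by a relative term of order $(\log \log p) / \log p$.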
Proof of Theorem 2. (i). Assume (ii) of the theorem holds. Since $(\log p)/n \to \beta$ as $n \to \infty$, dividing (11) by $n$, we see that $\log(1 - L_n^2) \to -4\beta$ in probability, or equivalently, $L_n \to \sqrt{1 - e^{-4\beta}}$ in probability as $n \to \infty$.

(ii). The proof here does not rely on the conclusion in (i). We first show that

$$(n - 2) \log(1 - L_n^2) + 4 \log p - \log \log p \tag{45}$$

converges weakly to the distribution function $F(y) = 1 - e^{-K(\beta) e^{y/2}}$, $y \in \mathbb{R}$, where $K(\beta)$ is as in (12). If this is true, by the condition $(\log p)/n \to \beta$ and the argument as in (i) above, we see that

$$\log(1 - L_n^2) \to -4\beta$$

in probability as $n \to \infty$. Thus, by the Slutsky lemma,

$$n \log(1 - L_n^2) + 4 \log p - \log \log p = \left[ (n - 2) \log(1 - L_n^2) + 4 \log p - \log \log p \right] + 2 \log(1 - L_n^2)$$

converges weakly to the distribution function $F(y) = 1 - e^{-K(\beta) e^{(y + 8\beta)/2}}$, $y \in \mathbb{R}$. We now prove the claim about (45).

Fix $y \in \mathbb{R}$. Let $N = n - 2$ and $t = t_n \in [0, 1)$ be such that

$$t^2 = 1 - \exp\left\{ \frac{1}{N} (-4 \log p + \log \log p + y) \wedge 0 \right\}.$$

It is easy to see that

$$P((n - 2) \log(1 - L_n^2) + 4 \log p - \log \log p \ge y) = P(L_n \le t) \tag{46}$$

when $n$ is sufficiently large, and

$$\lim_{n \to \infty} t_n^2 = 1 - e^{-4\beta} \in (0, 1) \quad \text{and} \quad N \log(1 - t^2) = -4 \log p + \log \log p + y \tag{47}$$

when $n$ is sufficiently large. We now calculate $h_n$ in Lemma 6.4 to obtain $\lim_{n \to \infty} P(L_n \le t)$. Recall

$$h_n = \frac{n^{1/2} p^2}{\sqrt{2\pi}} \int_t^1 (1 - x^2)^{\frac{n-4}{2}}\, dx. \tag{48}$$

It follows from Lemma 6.2 and the first identity in (47) that

$$\int_t^1 (1 - x^2)^{(n-4)/2}\, dx \sim \frac{1}{nt} (1 - t^2)^{(n-2)/2} \sim \frac{1}{N \sqrt{1 - e^{-4\beta}}} (1 - t^2)^{N/2}$$

as $n \to \infty$. By using the second identity in (47), we see that

$$(1 - t^2)^{N/2} = \exp\left\{ \frac{-4 \log p + \log \log p + y}{N} \cdot \frac{N}{2} \right\} = \frac{\sqrt{\log p}}{p^2} e^{y/2}$$

when $n$ is sufficiently large. Collecting all the facts above, and using $(\log p)/n \to \beta$, we have

$$\lim_{n \to \infty} h_n = K(\beta) e^{y/2}$$

where

$$K(\beta) = \left( \frac{\beta}{2\pi (1 - e^{-4\beta})} \right)^{1/2}.$$

By (46) and then Lemma 6.4, we have

$$\lim_{n \to \infty} P((n - 2) \log(1 - L_n^2) + 4 \log p - \log \log p \ge y) = e^{-K(\beta) e^{y/2}}$$

for any $y \in \mathbb{R}$. By the same argument as in getting (39), the above yields that

$$\lim_{n \to \infty} P((n - 2) \log(1 - L_n^2) + 4 \log p - \log \log p \le y) = 1 - e^{-K(\beta) e^{y/2}}$$

for any $y \in \mathbb{R}$. We eventually arrive at the claim about (45). ■
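The constant in this proof connects the regimes of Theorems 1 and 2, and this can be illustrated with a short computation. The sketch below assumes that $K(\beta) = \left( \frac{\beta}{2\pi (1 - e^{-4\beta})} \right)^{1/2}$, the value obtained by collecting the constants in the proof above; equation (12) itself lies outside this section. As $\beta \to 0$ this constant tends to $\frac{1}{\sqrt{8\pi}}$, the constant $K$ of Theorem 1. The function names are hypothetical.

```python
import math


def K(beta):
    """Assumed constant of Theorem 2: sqrt(beta / (2 pi (1 - exp(-4 beta))))."""
    return math.sqrt(beta / (2.0 * math.pi * -math.expm1(-4.0 * beta)))


def limit_cdf(y, beta):
    """Limit law of n log(1 - L_n^2) + 4 log p - log log p when (log p)/n -> beta."""
    return 1.0 - math.exp(-K(beta) * math.exp((y + 8.0 * beta) / 2.0))


print("K(beta) as beta -> 0:", K(1e-8), "versus 1/sqrt(8 pi):", 1.0 / math.sqrt(8.0 * math.pi))
for beta in (0.1, 0.5, 2.0):
    print(beta, "K(beta) =", K(beta), "median-type value F(0) =", limit_cdf(0.0, beta))
```

In this sense the limit law of Theorem 1 is recovered from that of Theorem 2 by letting $\beta \to 0$, since both $K(\beta) \to K$ and the shift $8\beta \to 0$.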

Proof of Theorem 3. (i). Assuming (ii) of the theorem and dividing the statistic in (ii) by $\log p$, we see that

$$\frac{n}{\log p} \log(1 - L_n^2) \to -4$$

in probability as $n \to \infty$. Since $(\log p)/n \to \infty$, we have $L_n \to 1$ in probability as $n \to \infty$.

(ii). The proof in this part does not rely on the conclusion in (i). Fix $y \in \mathbb{R}$. Let $N = n - 2$ and $t = t_n \ge 0$ be such that

$$t^2 = 1 - \exp\left\{ \frac{1}{N} (-4 \log p + \log n + y) \wedge 0 \right\}.$$

Obviously, $t_n \to 1$ as $n \to \infty$ by the condition $(\log p)/n \to \infty$. Thus, without loss of generality, assume $t = t_n \in (0, 1)$ for all $n \ge 1$. Easily,

$$\log(1 - t^2) = \frac{-4 \log p + \log n + y}{N} \quad \text{and} \tag{49}$$

$$P((n - 2) \log(1 - L_n^2) + 4 \log p - \log n \ge y) = P(L_n \le t) \tag{50}$$

when $n$ is sufficiently large. We now evaluate $h_n$ in Lemma 6.4 to obtain $\lim_{n \to \infty} P(L_n \le t)$. Recall

$$h_n = \frac{n^{1/2} p^2}{\sqrt{2\pi}} \int_t^1 (1 - x^2)^{\frac{n-4}{2}}\, dx. \tag{51}$$

From Lemma 6.2 and the fact $t_n \to 1$ as $n \to \infty$ we obtain

$$\int_t^1 (1 - x^2)^{(n-4)/2}\, dx \sim \frac{1}{nt} (1 - t^2)^{(n-2)/2} \sim \frac{1}{N} (1 - t^2)^{N/2}$$

as $n \to \infty$. Combining this and (49), we have

$$\frac{1}{N} (1 - t^2)^{N/2} = \frac{1}{N} \exp\left\{ \frac{-4 \log p + \log n + y}{N} \cdot \frac{N}{2} \right\} = \frac{\sqrt{n}}{N p^2} e^{y/2} \sim \frac{1}{p^2 \sqrt{n}} e^{y/2}$$

as $n \to \infty$. Joining all the above we have that

$$\lim_{n \to \infty} h_n = \frac{1}{\sqrt{2\pi}} e^{y/2}.$$

From (50) and then Lemma 6.4, we finally obtain

$$\lim_{n \to \infty} P((n - 2) \log(1 - L_n^2) + 4 \log p - \log n \ge y) = e^{-K e^{y/2}}$$

for any $y \in \mathbb{R}$, where $K = \frac{1}{\sqrt{2\pi}}$. By the same argument as in getting (39), the above actually implies that

$$\lim_{n \to \infty} P((n - 2) \log(1 - L_n^2) + 4 \log p - \log n \le y) = 1 - e^{-K e^{y/2}}$$

for any $y \in \mathbb{R}$. Writing $T_n = \log(1 - L_n^2)$, this says that

$$(n - 2) T_n + 4 \log p - \log n \to F(y) \quad \text{weakly} \tag{52}$$

with $F(y) = 1 - e^{-K e^{y/2}}$, $y \in \mathbb{R}$ and $K = 1/\sqrt{2\pi}$. Further, multiplying the left hand side of (52) by $\frac{2}{n - 2}$, we obtain

$$2 T_n + \frac{8 \log p}{n - 2} \to 0 \quad \text{in probability} \tag{53}$$

as $n \to \infty$. Notice that $(n - 2) T_n + 2 T_n = n T_n$. Adding up (52) and (53), we conclude from the Slutsky lemma that

$$n T_n + 4 \log p - \log n + \frac{8 \log p}{n - 2} = n T_n + \frac{4n}{n - 2} \log p - \log n$$

converges weakly to the distribution function $F(y) = 1 - e^{-K e^{y/2}}$, $y \in \mathbb{R}$, with $K = 1/\sqrt{2\pi}$. ■

6.3 Proofs for Results on $\tilde{L}_n$ in Section 2.2

The proofs of the results on $\tilde{L}_n$ are analogous to those of the results on $L_n$. The essential difference is to apply (22) in place of (21). Keeping all other arguments, we then get the proofs of the results on $\tilde{L}_n$ stated in Section 2.2. We omit the details for reasons of space.